Supernova Cosmology

In this final exercise, you will be analyzing data from Type Ia supernovae to try to constrain fundamental cosmological parameters, namely the portion of the energy in the universe that is composed of matter $\Omega_{m,0}$ and dark energy $\Omega_{\Lambda,0}$.

This notebook builds on all the skills and exercises covered previously in the course and will require you to code up most of the material from scratch. Just progressing a decent way through this notebook is already a huge accomplishment given that last week we were covering Python basics.

Objective Review

Remember, the overall goals here are to (1) learn the basics of using Python for data analysis and (2) get comfortable hacking at/debugging code. Proceed at whatever pace is most suitable for you so that you feel you're learning new things and taking away skills that you can apply to coding in the future!

Preamble

As always, we will run the following cell to ensure compatibility between Python 2 and 3. We will also import the packages (numpy, matplotlib, scipy) used throughout the rest of the notebook.


In [ ]:
from __future__ import print_function, division
from six.moves import range
import sys
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

%matplotlib inline

Setting up the Problem

Measuring distances in astronomy is Really Hard. In general, two methods are most commonly used:

  • Standard candles: These are objects whose luminosity $L$ we think we know. We can then derive the distance to these objects by measuring the observed flux $F$ via (see the short numerical sketch after this list): $$ F = \frac{L}{4\pi d_L^2} ~\Rightarrow~ d_L = \sqrt{\frac{L}{4\pi F}} $$
  • Standard rulers: These are objects whose intrinsic size $r$ (width, radius, etc.) we think we know. We can then derive the distance to these objects by measuring their observed angular size $\theta$ via: $$ \tan \theta \approx \theta = \frac{r}{d_A} ~\Rightarrow~ d_A = \frac{r}{\theta} $$
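
To make the standard-candle relation concrete, here's a quick numerical sketch (the luminosity and flux values below are made up for illustration, not real supernova numbers):


In [ ]:
# hypothetical standard candle: known luminosity, measured flux
L = 1e36   # intrinsic luminosity [W] (made-up value)
F = 1e-9   # observed flux [W/m^2] (made-up value)

# invert F = L / (4 pi d^2) to recover the distance
d = np.sqrt(L / (4. * np.pi * F))
print('distance = {0:.3g} m'.format(d))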

We will be focusing on the first case using Type Ia supernovae: explosions whose maximum brightness we think we know to pretty good accuracy.

While the above formulae work assuming a static and unchanging universe, they no longer hold for a universe that is expanding. In particular, they are sensitive to the expansion history of the Universe, which is a function of the fraction of the energy density today composed of matter $\Omega_{m,0}$, radiation $\Omega_{r,0}$, and dark energy $\Omega_{\Lambda,0}$. For example, $\Omega_{m,0}=0.5$ means 50% of the energy density of the universe today is matter, while $\Omega_{\Lambda,0}=0.3$ would mean 30% of the energy density today is dark energy.

Note that these can add up to be greater than or less than one. The "extra" (positive or negative) energy curves the spacetime of the entire universe.
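
Concretely, the "extra" energy is bundled into a curvature term (this is exactly how the calculator below handles it):

$$ \Omega_{K,0} = 1 - \Omega_{m,0} - \Omega_{r,0} - \Omega_{\Lambda,0} $$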

This changes our formula a little bit so that we instead have:

$$ d_L(t_{\rm lookback} \,|\, \Omega_{m,0}, \Omega_{r,0}, \Omega_{\Lambda,0}) $$

where the bar $|$ means "conditioned on" (with those parameters given/fixed/known) and $t_{\rm lookback}$ is the time "looking back into the past".

While this might look fancy, it just means that given the energy density of matter, radiation, and dark energy known today, we can "wind back the clock" and predict how far light has traveled from $t_{\rm lookback}$ to today, taking into account the expansion of the universe.

Unfortunately, we can't directly measure how far in the past an event occurred when we observe it. We can, however, measure the observed redshift $z$ due to the expansion of the universe. And, fortunately for us, it turns out that there is a relationship between redshift and lookback time:

$$ t_{\rm lookback}(z \,|\, \Omega_{m,0}, \Omega_{r,0}, \Omega_{\Lambda,0}) $$

Substituting in then allows us to write down a relationship between the observed flux $F$ and the luminosity distance $d_L$ given that we know the intrinsic luminosity $L$:

$$ \boxed{ F(z \,|\, \Omega_{m,0}, \Omega_{\Lambda,0}) = \frac{L}{4\pi d_L^2(z \,|\, \Omega_{m,0}, \Omega_{\Lambda,0})} } $$

where we have ignored the dependence on $\Omega_{r,0}$ because it has a negligible impact on our results.

I've defined a function to compute $d_L$ below. It relies on a "cosmology calculator" that solves a couple of numerical integrals based on the provided cosmological parameters.


In [ ]:
# function to compute the luminosity distance.
def d_L(zs, Omega_m=0.3, Omega_L=0.7, 
        Omega_r=0.0, H0=100., N=1000, zgrid=None):
    """
    Compute luminosity distance. See `cosmocalc` 
    for information on the other parameters (note 
    that the default `H0` here differs from `cosmocalc`'s).

    Parameters
    ----------
    zs : array
        Redshift grid.

    zgrid : array, optional
        The redshift grid used to compute the
        baseline relation (which is subsequently interpolated).
        Default is `arange(0, 2.5+1e-5, 0.02)`.

    Returns
    -------
    DL_Mpc : array
        Luminosity distance in Mpc.

    """
    
    if zgrid is None:  # define our default grid if none is provided
        zgrid = np.arange(0, 2.5+1e-5, 0.02)
        
    d_l = np.array([cosmocalc(z, H0=H0, N=N,
                              Omega_m=Omega_m,
                              Omega_L=Omega_L,
                              Omega_r=Omega_r)[3]  # grab d_L [Mpc]
                    for z in zgrid])
    
    return np.interp(zs, zgrid, d_l)  # interpolate from `zgrid` onto `zs`

# Define a very simple cosmology calculator.
def cosmocalc(z, Omega_m=0.3, Omega_L=0.7, 
              Omega_r=0.0, H0=70., N=1000):
    """
    Compute cosmological quantities.

    Parameters
    ----------
    z : array
        Redshift.

    Omega_m : float, optional
        Matter density today (in units of the critical density).
        Default `0.3`.

    Omega_L : float, optional
        Dark energy density today (in units of the critical density).
        Default `0.7`.

    Omega_r : float, optional
        Radiation energy density today (in units of the 
        critical density). Default `0.0`. Note that the 
        curvature density `Omega_K` is not an argument: it 
        is derived internally as `1 - Omega_m - Omega_r - Omega_L`.
        
    H0 : float, optional
        Hubble constant today (`z=0`). Default is `70.`.

    N : int, optional
        The number of points used in the integrals. 
        Default is `1000`.

    Returns
    -------
    zage_Gyr : float
        Age of the universe at redshift `z` in Gyr.

    DCMR_Mpc : float
        Comoving distance in Mpc.

    DA_Mpc : float
        Angular diameter distance in Mpc.

    DL_Mpc : float
        Luminosity distance in Mpc.

    V_Gpc : float
        Comoving volume in Gpc^3.

    """

    Omega_K = 1. - Omega_m - Omega_r - Omega_L

    # Initialize constants.
    c = 299792.458  # velocity of light in km/sec
    Tyr = 977.8  # conversion from 1/H to Gyr
    h = H0/100.  # normalized H0
    az = 1.0 / (1. + z)  # scale factor at redshift z

    # Compute age.
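    # Midpoint-rule integral for the age: t(z) = int_0^az da / (da/dt),
    # where (da/dt)/H0 = adot follows from the Friedmann equation.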
    age = 0.
    for i in range(N):
        a = az * (i + 0.5) / N
        adot = np.sqrt(Omega_K + (Omega_m / a) + (Omega_r / a**2)
                       + (Omega_L * a**2))
        age = age + 1. / adot
    zage = az * age / N
    zage_Gyr = (Tyr/H0) * zage

    
    # Compute comoving radial distance.
    DTT = 0.0
    DCMR = 0.0
    for i in range(N):
        a = az + (1 - az) * (i + 0.5) / N
        adot = np.sqrt(Omega_K + (Omega_m / a) + (Omega_r / a**2)
                       + (Omega_L * a**2))
        DTT = DTT + 1. / adot
        DCMR = DCMR + 1. / (a * adot)
    DTT = (1. - az) * DTT / N
    DCMR = (1. - az) * DCMR / N

    # Compute/convert quantities.
    age = DTT + zage  # age [1/H]
    age_Gyr = (Tyr / H0) * age  # age [Gyr]
    DTT_Gyr = (Tyr / H0) * DTT  # travel time [Gyr]
    DCMR_Gyr = (Tyr / H0) * DCMR  # comoving R [Glyr]
    DCMR_Mpc = (c / H0) * DCMR  # comoving R [Mpc]

    # Compute tangential quantities.
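    # The `ratio` factor converts comoving radial distance to transverse
    # comoving distance: sinh(x)/x for an open universe (Omega_K > 0),
    # sin(x)/x for a closed one, with a small-x Taylor expansion below.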
    ratio = 1.00
    x = np.sqrt(abs(Omega_K)) * DCMR
    if x > 0.1:
        if Omega_K > 0:
            ratio =  0.5 * (np.exp(x) - np.exp(-x)) / x
        else:
            ratio = np.sin(x) / x
    else:
        y = x**2
        if Omega_K < 0:
            y = -y
        ratio = 1. + y / 6. + y**2 / 120.

    # Compute/convert quantities.
    DCMT = ratio * DCMR  # comoving T [1/H]
    DA = az * DCMT  # angular diameter distance [1/H]
    DA_Mpc = (c / H0) * DA  # DA [Mpc]
    kpc_DA = DA_Mpc / 206.264806  # DA scale [kpc per arcsec]
    DA_Gyr = (Tyr / H0) * DA  # DA [Glyr]
    DL = DA / (az * az)  # luminosity distance [1/H]
    DL_Mpc = (c / H0) * DL  # DL [Mpc]
    DL_Gyr = (Tyr / H0) * DL  # DL [Glyr]

    # Compute comoving volume.
    ratio = 1.00
    x = np.sqrt(abs(Omega_K)) * DCMR
    if x > 0.1:
        if Omega_K > 0:
            ratio = (0.125 * (np.exp(2. * x) - np.exp(-2. * x))
                     - x / 2.) / (x**3 / 3.)
        else:
            ratio = (x / 2. - np.sin(2. * x) / 4.) / (x**3 / 3.)
    else:
        y = x**2
        if Omega_K < 0:
            y = -y
        ratio = 1. + y / 5. + (2. / 105.) * y**2
    VCM = ratio * DCMR**3 / 3  # comoving volume [1/H]
    V_Gpc = 4. * np.pi * (1e-3 * c / H0)**3 * VCM  # convert to Gpc

    return zage_Gyr, DCMR_Mpc, DA_Mpc, DL_Mpc, V_Gpc

Note that these functions have a bunch of optional arguments. If you do not provide them, they will be set to the pre-specified defaults.
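
For example, here is a sketch of how the defaults work in practice: the first call below uses the default cosmology, while the second overrides the matter and dark energy densities.


In [ ]:
# default cosmology (Omega_m=0.3, Omega_L=0.7)
print(d_L(0.5))

# same redshift, but with overridden densities
print(d_L(0.5, Omega_m=0.8, Omega_L=0.8))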

Part 1: Luminosity Distance and Magnitude

Let's first gain some familiarity with the function above.

Define a grid in redshift from 0 to 1.5 with 1000 points. Then plot the predicted luminosity distance as a function of redshift.

Note that all distances are in units of Megaparsecs (Mpc).


In [ ]:
# define redshift grid
zgrid = ...

# compute luminosity distances
dists = d_L(...)

# plot results
# remember to label your axes!
plt.plot(...)
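
If you get stuck, one possible completion is sketched below (just one way among several):


In [ ]:
# define redshift grid
zgrid = np.linspace(0., 1.5, 1000)

# compute luminosity distances (default cosmology)
dists = d_L(zgrid)

# plot results
plt.plot(zgrid, dists)
plt.xlabel('Redshift $z$')
plt.ylabel('Luminosity distance $d_L$ [Mpc]')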

Now, compute, plot, and label the luminosity distance $d_L(z \,|\, \Omega_{m,0}, \Omega_{\Lambda,0})$ for different combinations of $(\Omega_{m,0}, \Omega_{\Lambda,0})$:

  1. $(0.3, 0.0)$,
  2. $(0.3, 0.4)$,
  3. $(0.3, 0.7)$,
  4. $(0.5, 0.7)$,
  5. $(0.8, 0.8)$.

In [ ]:
# compute dists for each combination (1->5)
...

# plot results
# remember to label your axes!
...
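
One possible way to structure this, sketched assuming `zgrid` from the previous part is still defined:


In [ ]:
# compute and plot dists for each combination (1->5)
combos = [(0.3, 0.0), (0.3, 0.4), (0.3, 0.7), (0.5, 0.7), (0.8, 0.8)]
for Om, OL in combos:
    plt.plot(zgrid, d_L(zgrid, Omega_m=Om, Omega_L=OL),
             label=r'$\Omega_{{m,0}}={0}, \Omega_{{\Lambda,0}}={1}$'.format(Om, OL))
plt.xlabel('Redshift $z$')
plt.ylabel('Luminosity distance $d_L$ [Mpc]')
plt.legend()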

As discussed above, what we measure is not actually the luminosity distance directly, but rather a flux $F$. In fact, most astronomical measurements are not even reported as fluxes, but rather as magnitudes, a logarithmic unit of brightness. Assuming we know the intrinsic brightness of a source, we can convert from distance $d_L$ (in Mpc) to magnitude $m$ using:

$$ m = M + 5 \, (\log_{10} d_L + 5) $$

where $M$ is the "absolute magnitude", which is just another way of specifying the luminosity. For Type Ia supernovae, $M = -19.5$.
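
As a quick arithmetic check: a Type Ia supernova at $d_L = 100$ Mpc has $\log_{10} d_L = 2$, so $m = -19.5 + 5\,(2 + 5) = 15.5$.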

Convert the 5 $d_L(z \,|\, \Omega_{m,0}, \Omega_{\Lambda,0})$ predictions over zgrid you computed above to the equivalent magnitude predictions using the function below. Then plot and label your results.


In [ ]:
def mag(d, M=-19.5):
    """
    Convert distance(s) `d` into magnitude(s) given 
    absolute magnitude `M`. Assumes `d` is in units of Mpc.
    
    """
    
    return ...
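
In case you want to check your answer against something, one possible completion (a direct transcription of the formula above):


In [ ]:
def mag(d, M=-19.5):
    """
    Convert distance(s) `d` into magnitude(s) given 
    absolute magnitude `M`. Assumes `d` is in units of Mpc.
    
    """
    
    return M + 5. * (np.log10(d) + 5.)  # distance modulus for d in Mpc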

In [ ]:
# convert from dists to mags
...

# plot results
# remember to label your axes!

Part 2: Inspecting Data

We now want to fit the luminosity distance and magnitude predictions to some Type Ia supernova data. It is stored under sne1a_data.txt and contains the redshifts, observed magnitudes, and magnitude errors from 50 (mock) supernova observations.

Using whatever methods you prefer, read in the supernovae data. Then, plot:

  1. a histogram of the redshift distribution of our sample with redshift bins of width $\Delta z = 0.2$ and
  2. the observed magnitudes as a function of redshift.

In [ ]:
# read in data
z_spec, mag_spec, magerr_spec = ...

# plot histogram of redshifts
...

# plot magnitudes vs redshifts
...
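
One possible approach is sketched below, assuming `sne1a_data.txt` stores the three quantities as whitespace-separated columns in the order listed above (check the file itself to be sure):


In [ ]:
# read in data (assumed columns: redshift, magnitude, magnitude error)
z_spec, mag_spec, magerr_spec = np.loadtxt('sne1a_data.txt', unpack=True)

# plot histogram of redshifts (bins of width 0.2)
plt.hist(z_spec, bins=np.arange(0., z_spec.max() + 0.2, 0.2))
plt.xlabel('Redshift $z$')
plt.ylabel('Number of supernovae')
plt.show()

# plot magnitudes vs redshifts (with error bars)
plt.errorbar(z_spec, mag_spec, yerr=magerr_spec, fmt='o')
plt.xlabel('Redshift $z$')
plt.ylabel('Observed magnitude $m$')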

Part 3: Chi-Square

We now want to do something similar to the analysis from our linear regression exercises: quantify how well our models fit the data. We can incorporate our measurement errors using the $\chi^2$ function:

$$ \chi^2 = \sum_i \left( \frac{y_{{\rm pred},i} - y_{{\rm obs},i}}{\sigma_{{\rm obs},i}} \right)^2 $$

This should look really similar to our previous loss function, except that we now normalize our residuals by the observed errors.

Finish defining the chi2 function below. Then, compute the $\chi^2$ values for each of the 5 $(\Omega_{m,0}, \Omega_{\Lambda,0})$ combinations we computed above and print out the results.


In [ ]:
# Define our chi2 function.
def chi2(pred, obs, err):
    """
    Compute the chi2 between the predictions `pred`
    and the observations `obs` given the observed errors `err`.
    
    """
    
    return ...
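
One possible completion (a sketch; any equivalent vectorized form works):


In [ ]:
def chi2(pred, obs, err):
    """
    Compute the chi2 between the predictions `pred`
    and the observations `obs` given the observed errors `err`.
    
    """
    
    return np.sum(((pred - obs) / err)**2)  # sum of squared normalized residuals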

In [ ]:
# compute chi2 for each of the combinations
# should follow something like the example shown below
combos = [(0.3, 0.0), (0.3, 0.4), (0.3, 0.7), (0.5, 0.7), (0.8, 0.8)]
for Om, OL in combos:
    dpred = d_L(z_spec, ...)
    mpred = mag(...)
    chisquare = chi2(...)
    print('Omega_m, Omega_L = {0}, {1}'.format(Om, OL),
          '; chi2 = ', chisquare)

Part 4: Fitting the Data

We now want to fit the data. We will do this in two parts:

Minimizing $\chi^2$

First, we want to run an analysis similar to our previous linear regression exercise, where we attempt to find the $(\Omega_{m,0}, \Omega_{\Lambda,0})$ values where our $\chi^2$ is smallest. I've defined a modified function below that matches the format needed to pass into scipy.optimize.minimize.


In [ ]:
# Define our modified function to match the format for `minimize`.
def calc_chi2(theta):
    Om, OL = theta
    dpred = d_L(z_spec, Omega_m=Om, Omega_L=OL)
    mpred = mag(dpred)
    chisquare = chi2(mpred, mag_spec, magerr_spec)
    
    return chisquare

Using scipy.optimize.minimize, find the best-fit combination of $(\Omega_{m,0}, \Omega_{\Lambda,0})$ using the loss function defined above starting from $(0.5, 0.5)$ and subject to the constraints $0 < \Omega_{m,0} < 1$ and $0 < \Omega_{\Lambda,0} < 1$.


In [ ]:
# Find the best fit.
x0 = ...  # initial guess
bounds = ...  # bounds for guesses (see documentation)
results = minimize(calc_chi2, x0, bounds=bounds)  # minimize!
theta_best = results['x']  # get best fit
theta_cov = results['hess_inv'].todense()  # get covariance matrix

# print results
print(theta_best)
print(theta_cov)
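
If you want to check your setup, one possible choice that matches the constraints above:


In [ ]:
x0 = [0.5, 0.5]  # initial guess for (Omega_m, Omega_L)
bounds = [(0., 1.), (0., 1.)]  # box constraints on each parameter

Note that when bounds are passed, minimize defaults to the L-BFGS-B method, whose `hess_inv` is returned as a linear operator; that is why the cell above calls `.todense()` on it.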

Plot a 2-D histogram of 10000 draws from the resulting Multivariate Normal distribution using numpy.random.multivariate_normal. Limit the x and y axes to be from 0 to 1.


In [ ]:
# plot 2-D histogram
# remember to label your axes!
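
A minimal sketch of one way to do this (assuming `theta_best` and `theta_cov` from the previous cell):


In [ ]:
# draw 10000 samples from the fitted multivariate normal
samples = np.random.multivariate_normal(theta_best, theta_cov, size=10000)

# plot a 2-D histogram of the draws, limited to [0, 1] on both axes
plt.hist2d(samples[:, 0], samples[:, 1], bins=50,
           range=[[0., 1.], [0., 1.]])
plt.xlabel(r'$\Omega_{m,0}$')
plt.ylabel(r'$\Omega_{\Lambda,0}$')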

Brute Force

Alternatively, we can just brute-force the problem by computing $\chi^2$ for tons of possible parameter combinations $(\Omega_{m,0}, \Omega_{\Lambda,0})$ and then plotting how our $\chi^2$ changes as $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$ change.

Based on the code below, fill in each element of the chi2_arr with the corresponding $\chi^2$ value.


In [ ]:
# define our grid
Omega_m = np.arange(0, 1.01, 0.04)  # grid in Omega_m
Omega_L = np.arange(0, 1.01, 0.04)  # grid in Omega_L

# do our brute-force grid search
chi2_arr = np.zeros((len(Omega_m), len(Omega_L)))
for i, Om in enumerate(Omega_m):  # loop over Omega_m
    for j, OL in enumerate(Omega_L):  # loop over Omega_L
        # compute chi2
        chisquare = ...
        
        # fill in array
        chi2_arr[i, j] = chisquare
        
        # Print progress.
        sys.stderr.write('\r Omega_m = {0}, Omega_L = {1}     '
                         .format(Om, OL))

Just like with our galaxy example, we can convert each of our $\chi^2$ values to a corresponding weight that we assign to each combination of parameters. This is computed via

$$ w(\chi^2) = e^{-\chi^2/2} $$

Using the formula above, convert the $\chi^2$ values we just computed to a set of weights $w$. Then, plot the resulting 2-D weighted array using plt.imshow based on the code below. Remember to set origin to the right location and extent to ensure you get the right bounds.


In [ ]:
# convert chi2 to weights
weights = ...

# plot results
# remember to label your axes!
plt.imshow(weights.T, origin=..., extent=...)
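
One possible completion is sketched below. Subtracting the minimum $\chi^2$ before exponentiating is optional: it only rescales the weights, but it helps avoid numerical underflow.


In [ ]:
# convert chi2 to (rescaled) weights
weights = np.exp(-0.5 * (chi2_arr - chi2_arr.min()))

# plot results
plt.imshow(weights.T, origin='lower', extent=[0., 1., 0., 1.])
plt.xlabel(r'$\Omega_{m,0}$')
plt.ylabel(r'$\Omega_{\Lambda,0}$')
plt.colorbar(label='weight')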

Summary

Using 50 observations of Type Ia supernovae, we have placed constraints on the fractions of the energy density of the universe today composed of matter $\Omega_{m,0}$ and dark energy $\Omega_{\Lambda,0}$. Our results show that our universe is very inconsistent with being mostly matter and instead is mainly composed of dark energy!

Solutions

I've included snapshots of what your plots/results should look like below, in case you want to see what your final answers should (sort of) look like.

Part 1

Your final results should look something like the plots below.

So we can see that the differences between each of these cosmologies are not that large!

Part 2

Your final plots should look something like those shown below.

Part 3

Your results should look the same as the results below.

Part 4

The errors around the best-fit solution look like:

The results we get from our brute-force grid look something like:

Close, but not identical.